My data
I decided to go in a different direction, with new data. CITES (the Convention on International Trade in Endangered Species of Wild Fauna and Flora) is a treaty under which international transactions of wild plants and animals are recorded. I downloaded the data on all plants in the family Orchidaceae (orchids) since 1975 (the earliest year in their records). I thought this data would be interesting to look at because each record gives the orchid species (or at least the genus), where it came from, and where it went, along with some other fields. I’ve made a graph showing the top 10 countries exporting orchids to the United States, totaling counts of recorded imports over the past four years.
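The counting behind a graph like this can be sketched roughly as follows. This is a minimal sketch on made-up rows, not the code used for the actual figure. It assumes the graph counts records where the United States is the importer, grouped by exporting country; the `Year` column name, the 2015 cutoff, and the toy rows are all assumptions for illustration.

```r
library(dplyr)

# Toy stand-in for the CITES trade table (hypothetical rows)
toy_trade <- data.frame(
  Year     = c(2015, 2016, 2016, 2017, 2018, 2018, 2012),
  Importer = c("US", "US", "US", "US", "US", "CA", "US"),
  Exporter = c("TH", "TH", "TW", "NL", "TW", "TH", "TH"),
  stringsAsFactors = FALSE
)

top_exporters_to_us <- toy_trade %>%
  filter(Importer == "US", Year >= 2015) %>%  # US imports inside the assumed window
  count(Exporter, sort = TRUE) %>%            # one row per exporter, largest count first
  head(10)                                    # keep at most the ten largest

print(top_exporters_to_us)
```

Counting rows treats every recorded shipment equally; summing a quantity column instead would rank countries by number of plants rather than number of records.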
Other Plans
For the future, I might look at the most popular genera exported from certain countries over time, the ratio between imports and exports for certain countries, the countries with the highest overall importing and exporting, and so on.
#New Data Alert!!
#I found some cool data about orchids on the IUCN Red List. This is an online database full of different species (not just orchids), their current extinction risks, and a whole lot of other useful information.
# read_xlsx() comes from the readxl package
library(readxl)
IUCN_assessments <- read_xlsx("~/Desktop/redlist_orchid_data/assessments.xlsx")
IUCN_taxonomy <- read_xlsx("~/Desktop/redlist_orchid_data/taxonomy.xlsx")
IUCN_orchids <- IUCN_assessments %>%
  inner_join(IUCN_taxonomy, by = c("internalTaxonId", "scientificName"))
#Can I somehow combine these tables? We're about to find out...
all_table <- CITES_orchid %>% inner_join(IUCN_orchids, by = c("Taxon" = "scientificName"))
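One way to find out how well the two tables combine is to count how many CITES taxa actually find a match. The sketch below uses toy tables and assumes only the `Taxon` and `scientificName` key columns from the join above; the species names, the `Quantity` column, and the category values are placeholders, not real records.

```r
library(dplyr)

# Hypothetical CITES-style rows; the last one is a genus-level record
toy_cites <- data.frame(
  Taxon    = c("Orchid one", "Orchid two", "Orchid spp."),
  Quantity = c(10, 3, 250),
  stringsAsFactors = FALSE
)

# Hypothetical IUCN-style rows, species level only
toy_iucn <- data.frame(
  scientificName  = c("Orchid one", "Orchid two"),
  redlistCategory = c("Endangered", "Least Concern"),
  stringsAsFactors = FALSE
)

# Rows that survive the inner join
matched <- toy_cites %>%
  inner_join(toy_iucn, by = c("Taxon" = "scientificName"))

# Rows the inner join silently drops
unmatched <- toy_cites %>%
  anti_join(toy_iucn, by = c("Taxon" = "scientificName"))

# Share of trade records that found an assessment
match_rate <- nrow(matched) / nrow(toy_cites)
print(match_rate)
```

Because some CITES records identify only the genus, they cannot match a species-level `scientificName`, so checking the `anti_join()` result shows how much trade an inner join leaves out.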
#Some ideas for the future…
#import/export map, globe with arrows connecting
#shiny app?
#use keywords to determine main causes of habitat loss, types of orchids more likely to be endangered
#submit pdf about project to causeweb.org/usproc/
#Citing the paper which uses similar data as you: https://academic.oup.com/botlinnean/article/186/4/435/4736317
#Reading in the country code data
country_codes <- read.csv("https://pkgstore.datahub.io/core/country-list/data_csv/data/d7c9d7cfb42cb69f4422dec222dbbaa8/data_csv.csv")
#Joining country code data
all_table <- all_table %>%
  left_join(country_codes, by = c("Importer" = "Code"))
## Warning: Column `Importer`/`Code` joining character vector and factor,
## coercing into character vector
all_table <- all_table %>%
  left_join(country_codes, by = c("Exporter" = "Code"))
## Warning: Column `Exporter`/`Code` joining character vector and factor,
## coercing into character vector
#Renaming the columns
setnames(all_table, old=c("Name.x","Name.y"), new=c("importer_country", "exporter_country"))
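The coercion warnings and the `Name.x`/`Name.y` suffixes can both be avoided. The following is a sketch on toy data, assuming the country file has `Code` and `Name` columns as the joins above imply. In the real `read.csv()` call, passing `stringsAsFactors = FALSE` should keep `Code` as character; the toy tables and the `toy_named` name are hypothetical.

```r
library(dplyr)

# Toy country lookup with character (not factor) keys
toy_codes <- data.frame(
  Code = c("US", "TH", "NL"),
  Name = c("United States", "Thailand", "Netherlands"),
  stringsAsFactors = FALSE
)

# Toy trade rows using the same two-letter codes
toy_trade <- data.frame(
  Importer = c("US", "NL"),
  Exporter = c("TH", "US"),
  stringsAsFactors = FALSE
)

# Rename the lookup's Name column before each join, so the result
# never contains Name.x / Name.y and needs no renaming afterwards
toy_named <- toy_trade %>%
  left_join(rename(toy_codes, importer_country = Name),
            by = c("Importer" = "Code")) %>%
  left_join(rename(toy_codes, exporter_country = Name),
            by = c("Exporter" = "Code"))

print(toy_named)
```

Renaming inside each join also removes the dependency on the order in which the two joins happen to run.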
#Next steps:
#Visualize locations of orchids on a 3D globe – could use coordinate points from IUCN data, or aggregate the orchids into groups based on their threatened level in each country (scroll over a country to see how many IUCN near threatened, threatened, endangered, and critical orchids grow there).
#Visualize import/export of orchids (by genus? subfamily?) based on magnitude for each country. This could be done on a globe or a flat map. Connecting lines/arrows for each shipment, with width varying depending on the number of plants/two circles or a bar chart to indicate import and export of plants, based on total number of shipments or total number of orchids.
#Search “rationale” column in IUCN assessments data (now part of all_table) for keywords or phrases such as “habitat decline,” “threat from agriculture” to show trends in the cause of species decline.
#Search “habitat” column for keywords relating to growth habit: “epiphytic” (tree-dwelling), “terrestrial” (soil-dwelling) or “epilithic” (rock-dwelling), to look for links between the growth habit and the natural abundance of plants in the face of changing landscapes.
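The keyword searches in these notes could start from something like the following. It is a sketch only: the toy rows are invented, the `habitat` column name is taken from the notes, and the `growth_habit` column and the partial-word patterns are hypothetical choices.

```r
library(dplyr)

# Hypothetical assessment rows with free-text habitat descriptions
toy_assessments <- data.frame(
  scientificName = c("Orchid one", "Orchid two", "Orchid three", "Orchid four"),
  habitat = c(
    "An epiphytic species of humid montane forest",
    "Terrestrial herb of open grassland",
    "Epilithic on limestone cliffs",
    "No habitat information recorded"
  ),
  stringsAsFactors = FALSE
)

# Tag each row with the first growth-habit keyword found in its text
toy_habits <- toy_assessments %>%
  mutate(growth_habit = case_when(
    grepl("epiphyt", habitat, ignore.case = TRUE)     ~ "epiphytic",
    grepl("terrestrial", habitat, ignore.case = TRUE) ~ "terrestrial",
    grepl("epilith", habitat, ignore.case = TRUE)     ~ "epilithic",
    TRUE                                              ~ NA_character_
  ))

# Tally species per growth habit (NA = no keyword matched)
habit_counts <- toy_habits %>% count(growth_habit)
print(habit_counts)
```

A description that mentions more than one habit is assigned only the first match here, so overlapping cases would need separate logical columns instead. The same pattern would apply to the “rationale” column with threat-related keywords.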
# Reading in IUCN's point dataset, which contains location
# records for plants across many genera
points <- read_xlsx("~/Desktop/redlist_orchid_points/points_data.xlsx")
# Joining the points dataset with IUCN_orchids to include
# red list categories.
points_plus <- points %>%
  left_join(IUCN_orchids, by = c("binomial" = "scientificName")) %>%
  select(binomial, longitude, latitude, redlistCategory, realm) %>%
  filter(redlistCategory != "Data Deficient")
# Ordering levels so they can be colored on the globe
points_plus$redlistCategory <-
  factor(points_plus$redlistCategory,
    levels = c(
      "Least Concern",
      "Near Threatened",
      "Vulnerable",
      "Endangered",
      "Critically Endangered",
      "Extinct"
    ))
# Associating colors with redlist levels
colors <- c(
  "Least Concern" = "#3D92FF",
  "Near Threatened" = "#81D3FF",
  "Vulnerable" = "#F9FF96",
  "Endangered" = "#FDC650",
  "Critically Endangered" = "#FF752C",
  "Extinct" = "#FF361E"
)
# Load threejs library
library(threejs)
## Loading required package: igraph
##
## Attaching package: 'igraph'
## The following objects are masked from 'package:dplyr':
##
## as_data_frame, groups, union
## The following objects are masked from 'package:purrr':
##
## compose, simplify
## The following object is masked from 'package:tidyr':
##
## crossing
## The following object is masked from 'package:tibble':
##
## as_data_frame
## The following objects are masked from 'package:stats':
##
## decompose, spectrum
## The following object is masked from 'package:base':
##
## union
#earth image
earth <-
"http://eoimages.gsfc.nasa.gov/images/imagerecords/73000/73909/world.topo.bathy.200412.3x5400x2700.jpg"
# Making the globe with globejs. Each point gets the color of its
# red list category; unname() drops the category names so that a
# plain character vector (one color per point) is passed to the widget.
globejs(
  img = earth,
  lat = points_plus$latitude,
  long = points_plus$longitude,
  color = unname(colors[as.character(points_plus$redlistCategory)])
)
#Country data was sourced from datahub.io